ABSTRACT

We present the first results from an ongoing survey for damped Lyman-α systems (DLAs) in the spectra of z > 2 quasars observed
in the course of the Baryon Oscillation Spectroscopic Survey (BOSS), which is part of the Sloan Digital Sky Survey (SDSS) III.
Our full (non-statistical) sample, based on Data Release 9, comprises 12,081 systems with log N(H I) > 20, out of which 6,839 have
log N(H I) > 20.3. This is the largest DLA sample ever compiled, superseding that from SDSS-II by a factor of seven.
Using a statistical sub-sample and estimating systematics from realistic mock data, we probe the N(H I) distribution at ⟨z⟩ = 2.5.
Contrary to what is generally believed, the distribution extends beyond 10^22 cm^-2 with a moderate slope of index ≈ -3.5. This result
matches surprisingly well the opacity-corrected distribution observed at z = 0. The cosmological mass density of neutral gas in DLAs
is found to be Ω_g^DLA ≈ 10^-3, evolving only mildly over the past 12 billion years.

Key words. cosmology: observations - quasars: absorption lines - galaxies: evolution
1. Introduction

Studying the distribution of neutral gas in and around galaxies at different cosmological times provides a wealth of information about the formation and evolution of galaxies. The 21-cm hyperfine emission of atomic hydrogen has been used to trace the neutral gas in nearby galaxies and estimate their total H I mass. Given the sensitivity of present-day radio telescopes, this technique remains limited to z < 0.2 (e.g. Lah et al. 2007). At high redshift, neutral gas is revealed by the damped Lyman-α absorption systems (DLAs) it imprints in the optical spectra of bright background sources such as quasars. Because the detection of DLAs is only cross-section dependent, it is possible to statistically derive the amount of neutral gas and the corresponding column density distribution at different redshifts independently of the nature of the absorbers (see Wolfe et al. 2005).

The most recent contributions to the census of DLAs used data mining of thousands of quasar spectra from the Sloan Digital Sky Survey (SDSS, York et al. 2000) by Prochaska et al. and in the work hereafter referred to as N09. These studies indicate that the N(H I) distribution function (f(N_HI, X), where X is the absorption distance, see Lanzetta et al. 1991) steepens at log N(H I) > 21 and that the cosmological density of neutral gas contained in DLAs (Ω_g^DLA) decreases significantly with time between z ~ 3.5 and z = 2.2.

Several explanations for the steepening of f(N_HI, X) have been discussed in the literature, including conversion from atomic to molecular hydrogen (Zwaan & Prochaska 2006), small-scale turbulence, or stellar feedback (Erkal et al. 2012). Selection effects such as dust-reddening (e.g. Vladilo & Péroux 2005) could also alter the slope of f(N_HI, X) in magnitude-limited surveys. However, the slope of the frequency distribution itself is not yet well constrained at the high-column-density end due to rapidly decreasing statistics. Similarly, the evolution of Ω_g^DLA has long been discussed in the literature. Values at z ~ 1 (Rao et al. 2006) have been considered uncomfortably high when compared to those at z = 0 and z ~ 2 from Zwaan et al. (2005) and Prochaska et al. (2005) respectively. However, N09 corrected upwards the value at z ~ 2. Since then, the value at z = 0 has also been corrected upwards by Braun (2012), which could indicate a flatter evolution over 0 < z < 2.

In this letter we present a search for DLAs in quasars observed in the course of the Baryon Oscillation Spectroscopic Survey (BOSS, Dawson et al. 2012), one of the legacy surveys in the third stage of the SDSS (Eisenstein et al. 2011). We use the same formalism as described in N09 and adopt a ΛCDM cosmology with Ω_Λ = 0.73, Ω_m = 0.27, and H_0 = 70 km s^-1 Mpc^-1 (Komatsu et al. 2011).

* The catalogue is only available in electronic form at the CDS via anonymous ftp to cdsarc.u-strasbg.fr (130.79.128.5) or via http://cdsweb.u-strasbg.fr/cgi-bin/xxxxxxxxxxxxxxxxxx.



2. Method 

BOSS is a five-year program using improved spectrographs (Smee et al. 2012) on the SDSS telescope (Gunn et al. 2006) to obtain spectra of 1.5 million galaxies and over 150,000 z > 2.15 quasars reaching up to 1 mag deeper than SDSS-II. The survey is mainly designed to measure the characteristic scale imprinted by baryon acoustic oscillations (BAOs) in the early Universe from the spatial distribution of luminous galaxies at z ~ 0.7 and the large-scale correlation of H I absorption lines in the intergalactic medium at z ~ 2.5 (Dawson et al. 2012). BOSS uses the same imaging data as in SDSS-I and II with an extension in the south galactic cap (see Aihara et al. 2011). The SDSS-DR9 (Ahn et al. 2012) makes publicly available the spectra of 87,822 quasars over an area of 3,275 deg^2, 65,205 having z > 2 (Pâris et al. 2012). The quasar target selection is described in Ross et al. (2012); see also Bovy et al. (2011).




Fig. 1. Redshift sensitivity function g(z) of our full DR9 sample (dotted) and statistical sample (black) compared to that of DR7 (N09, grey).

2.1. Detection of DLAs

Intervening DLAs were searched for automatically in quasar spectra following the method described in N09. We briefly summarise here the main steps. For the purpose of collecting the largest number of DLAs^1, we searched the full line-of-sight to each quasar starting where the spectral signal-to-noise ratio per pixel reaches 2 (defining z_min) and up to the quasar redshift. We avoid sight-lines with broad absorption lines with balnicity index BI > 1000 km s^-1 (Pâris et al. 2012).

The quasar continuum is modelled over the Ly-α forest by fitting a modified power-law with a smoothly changing index plus Moffat profiles on top of the emission lines. Whenever the Ly-α emission line was severely absorbed (>30%), we used the predicted unabsorbed emission from principal component analysis (see Pâris et al. 2011) as a proxy for the true Ly-α emission before fitting the continuum. We then use the median continuum-to-noise ratio (CNR) as an estimate of the quality of the spectrum, independent of the presence of a DLA. Spectra with median CNR < 2 over the Ly-α forest were not further considered.

Damped absorption lines are recognised through their characteristic shape by correlating the data against synthetic profiles of increasing column density (see N09). In short, (N(H I), z) pairs with Spearman's correlation above 0.5 (and significance > 3σ) are recorded. To constrain the strength of the absorption, we also impose that the absorbed flux should be consistent with the presence of a DLA combined with possible Ly-α forest absorptions. The pairs are then grouped into individual DLA candidates (a gap of >1000 km s^-1 indicates separate absorption systems), and the first guess for N(H I) is taken from the pair with the highest correlation. The DLA redshift measurement is then improved whenever possible by cross-correlating the QSO spectrum on the red side of the Lyman-α emission line with a mask representing metal absorption lines. Finally, N(H I) is obtained by fitting a Voigt profile to the damped Lyman-α line.

This approach provides us with an overall sample of 12,081 DLA candidates with log N(H I) > 20, out of which 6,839 have log N(H I) > 20.3 (Table 3, in the electronic version only). We also provide values of (or limits on) the equivalent widths of associated metal lines redwards of the Ly-α emission line.

1 DLAs are contaminants for the study of the Ly-α forest correlation function (Font-Ribera & Miralda-Escudé 2012).

2.2. Statistical sample

We subsequently define a statistical sub-sample that is used to derive the N(H I) distribution function and the integrated cosmological mass density of neutral gas. First of all, we conservatively reject all QSOs with even moderate balnicity (BI > 0 or flagged visually, Pâris et al. 2012) and apply a more stringent threshold on the data quality, keeping only spectra with CNR > 3. We then restrict the redshift range as follows: i) N09 showed that the presence of a DLA near the blue end of the spectrum can bias the definition of z_min and proposed a systematic 10,000 km s^-1 velocity shift to z_min that we also apply here; ii) we consider only the region 3000 km s^-1 redwards of the Ly-β emission line and 5000 km s^-1 bluewards of the Ly-α emission line. The first cut ensures that we consider only the Ly-α forest and avoid the Ly-β+O VI region where associated broad O VI absorption can occur (even if no broad C IV absorption is seen) and be mistaken for DLAs. The second cut avoids DLAs located in the vicinity of the QSO (e.g. Ellison et al. 2002). Finally, we restrict our study to the range z ∈ [2, 3.5]. The lower limit avoids the very blue end of the spectra (below 3650 Å) where reduction problems have been identified (Pâris et al. 2011). We set the upper limit because the increasing density of the Ly-α forest can introduce a significant fraction of false DLA identifications due to strong blending of the Ly-α forest lines at the SDSS resolution (Rafelski et al. 2012). The value of 3.5 corresponds to the redshift out to which this systematic can be reliably estimated based on an analysis of mock spectra.
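The redshift cuts above can be sketched for a single sight-line as follows. This is only an illustrative sketch, not the survey pipeline: the function names, the use of the relativistic Doppler factor to convert velocity offsets into redshifts, and the adopted rest wavelengths are assumptions made for the example.

```python
import math

C_KMS = 299792.458   # speed of light in km/s
LYA = 1215.67        # Ly-alpha rest wavelength in Angstrom
LYB = 1025.72        # Ly-beta rest wavelength in Angstrom


def shift_redshift(z, v_kms):
    """Redshift reached from z after a velocity offset v_kms (positive = redwards).

    The relativistic Doppler factor is used here; this choice is an assumption.
    """
    beta = v_kms / C_KMS
    return (1.0 + z) * math.sqrt((1.0 + beta) / (1.0 - beta)) - 1.0


def statistical_window(z_qso, z_min_snr, z_range=(2.0, 3.5)):
    """Return (z_low, z_high) of the DLA search path, or None if it is empty."""
    # i) shift z_min 10,000 km/s redwards of where the S/N threshold is reached
    z_snr = shift_redshift(z_min_snr, 10000.0)
    # ii) Ly-alpha absorption redshift that coincides with the Ly-beta emission
    #     line, moved 3000 km/s redwards
    z_lyb = shift_redshift((1.0 + z_qso) * LYB / LYA - 1.0, 3000.0)
    #     and 5000 km/s bluewards of the Ly-alpha emission line
    z_prox = shift_redshift(z_qso, -5000.0)
    z_low = max(z_range[0], z_snr, z_lyb)
    z_high = min(z_range[1], z_prox)
    return (z_low, z_high) if z_low < z_high else None


# Hypothetical quasar at z = 3.0 whose spectrum reaches the S/N threshold at z = 2.0
window = statistical_window(z_qso=3.0, z_min_snr=2.0)
```

For this hypothetical quasar the lower bound is set by the Ly-β cut rather than by the shifted z_min, which illustrates why the three lower limits have to be combined with a maximum.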

These cuts leave us with 37,503 lines-of-sight bearing 5,428 systems with log N(H I) > 20 (3,408 bona-fide DLAs with log N(H I) > 20.3). We present in Fig. 1 the sensitivity functions g(z) (i.e. the number of lines-of-sight covering a given redshift) for the full and statistical samples. Our statistical DR9 sample is more than 3 times larger than the overall DR7 sample and also extends to lower redshifts, thanks to the improved blue coverage of the SDSS spectrograph. The total absorption path length probed by our statistical sample over z = 2 - 3.5 is ΔX ≈ 45,000 with an average redshift ⟨z⟩ = 2.5.
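As a reminder of how an absorption path length follows from a sensitivity function, the sketch below integrates the standard absorption distance, dX/dz = (1+z)^2 H_0/H(z), for the flat ΛCDM parameters adopted in this letter. The function names and the toy sensitivity function are assumptions for illustration; the actual g(z) of the survey is the one shown in Fig. 1.

```python
import math

OMEGA_M = 0.27   # matter density adopted in the text
OMEGA_L = 0.73   # cosmological-constant density adopted in the text


def dX_dz(z):
    """dX/dz = (1+z)^2 H0/H(z) for a flat LambdaCDM cosmology."""
    return (1.0 + z) ** 2 / math.sqrt(OMEGA_M * (1.0 + z) ** 3 + OMEGA_L)


def absorption_distance(z1, z2, steps=1000):
    """Absorption distance between z1 and z2 along one sight-line (midpoint rule)."""
    dz = (z2 - z1) / steps
    return sum(dX_dz(z1 + (k + 0.5) * dz) for k in range(steps)) * dz


def total_path(g, z1, z2, steps=1000):
    """Total path: integral of g(z) dX/dz dz, with g(z) the sensitivity function."""
    dz = (z2 - z1) / steps
    total = 0.0
    for k in range(steps):
        z = z1 + (k + 0.5) * dz
        total += g(z) * dX_dz(z) * dz
    return total


one_sightline = absorption_distance(2.0, 2.3)
# Hypothetical flat sensitivity function: 10,000 sight-lines at every redshift
toy_total = total_path(lambda z: 10000.0, 2.0, 2.3)
```

With a real g(z) the same integral is simply evaluated on the tabulated sensitivity function instead of the constant used here.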
2.3. Mocks

The BOSS collaboration is constantly developing mocks that simulate the Lyman-α forest seen towards BOSS QSOs (e.g. Font-Ribera et al. 2012). While mocks were principally designed for BAO studies, the important point for this study is that simulated spectra are produced with the same noise and flux distributions as in the actual DR9 data (Bailey et al., in prep.). Furthermore, DLAs have been introduced to the mocks with a known distribution (Font-Ribera & Miralda-Escudé 2012). We applied our DLA-searching algorithm to 33 realisations of 3,861 mocks representative of the DR9 data with the same cuts as in real data. From this exercise, the completeness and purity (1 minus the fraction of false identifications) in the statistical sample are both found to be above 95% for log N(H I) > 20.3 (and higher when restricting to higher N(H I) systems). Overall, the automatic procedure systematically overestimates N(H I) by 0.03 dex. This is much lower than the dispersion (0.20 dex) which corresponds to the typical 1σ error on log N(H I).
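The completeness, purity and column-density offset quoted above are obtained by comparing the absorbers injected in the mocks with those recovered by the search. A minimal sketch of such a comparison is given below; the greedy velocity matching, the 1000 km s^-1 tolerance and all names are assumptions made for this example and are not taken from the actual analysis.

```python
C_KMS = 299792.458   # speed of light in km/s


def match_catalogues(injected, recovered, max_dv_kms=1000.0):
    """Greedy one-to-one match of recovered to injected absorbers in velocity.

    Both inputs are lists of (redshift, log N) tuples.
    Returns a list of (injected_index, recovered_index) pairs.
    """
    pairs, used = [], set()
    for j, (z_rec, _) in enumerate(recovered):
        best, best_dv = None, max_dv_kms
        for i, (z_in, _) in enumerate(injected):
            if i in used:
                continue
            dv = abs(z_rec - z_in) / (1.0 + z_in) * C_KMS
            if dv <= best_dv:
                best, best_dv = i, dv
        if best is not None:
            used.add(best)
            pairs.append((best, j))
    return pairs


def completeness_purity(injected, recovered, max_dv_kms=1000.0):
    """Return (completeness, purity, mean log N offset of the matched systems)."""
    pairs = match_catalogues(injected, recovered, max_dv_kms)
    completeness = len(pairs) / len(injected)
    purity = len(pairs) / len(recovered)   # 1 minus the fraction of false identifications
    offsets = [recovered[j][1] - injected[i][1] for i, j in pairs]
    mean_offset = sum(offsets) / len(offsets) if offsets else float("nan")
    return completeness, purity, mean_offset


# Toy catalogues (hypothetical values, not survey data)
injected = [(2.10, 20.50), (2.50, 21.00), (3.00, 20.40), (3.20, 20.80)]
recovered = [(2.1002, 20.55), (2.4995, 21.02), (3.20, 20.80), (2.80, 20.35)]
completeness, purity, mean_offset = completeness_purity(injected, recovered)
```

In this toy case one injected system is missed and one recovered system is spurious, so both fractions are 0.75.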

3. The column density distribution at ⟨z⟩ = 2.5

In Fig. 2a, we compare the simulated input distribution of H I column densities (in the range N(H I) = 10^20 - 4 x 10^21 cm^-2) at z = 2-3 with that recovered from mocks by our procedure over the same redshift range. We can see that the overall agreement is excellent, although our procedure slightly overestimates f(N_HI, X), particularly at the low column density end. We use the difference between the input and output distributions as the correction to apply to the observed distribution from real data.
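For concreteness, the distribution function in a column-density bin is the number of systems divided by the bin width ΔN and the absorption path ΔX, and the mock-based correction removes the bias measured between recovered and input mock distributions. The sketch below assumes the correction is applied as a difference in log f; the function names and all numbers are hypothetical.

```python
import math


def log_f(n_systems, logn_lo, logn_hi, delta_x):
    """log10 of f(N_HI, X) = m / (DeltaN * DeltaX) in one column-density bin."""
    delta_n = 10.0 ** logn_hi - 10.0 ** logn_lo
    return math.log10(n_systems / (delta_n * delta_x))


def mock_corrected(log_f_obs, log_f_mock_in, log_f_mock_out):
    """Subtract the mock bias (recovered minus input, in dex) from the observed value."""
    return log_f_obs - (log_f_mock_out - log_f_mock_in)


# Hypothetical bin: 400 systems with 20.3 <= log N < 20.4 over a path of 38,000
raw = log_f(400, 20.3, 20.4, 38000.0)
# Hypothetical mock values in which the procedure recovers 0.02 dex too many systems
corrected = mock_corrected(raw, -21.70, -21.68)
```

An overestimate in the mocks therefore lowers the corrected value by the same amount in dex.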

To ascertain the properties of the high-column-density end of f(N_HI, X), where statistics are much smaller, we have visually checked all DLA candidates with log N(H I) > 21.6. In this regime, blind correction using mocks could be more uncertain as the corresponding H I fits are based on Ly-α only while metals are systematically detected in the real data. Indeed, we found a few cases where two closely-spaced DLAs were mistaken for a higher column density one. Disentangling such blends was possible thanks to the presence of metal lines. For each DLA candidate with log N(H I) > 21.6, the absorption profile was carefully refitted manually, improving the continuum determination and using metal lines to determine a precise redshift of the absorber. The resulting f(N_HI, X) at z = 2 - 3 is shown in Fig. 2b with values given in Table 1. It is apparent that the distribution extends beyond 10^22 cm^-2 with 5 systems with log N(H I) > 22 in the statistical sample (8 in the full sample). Extrapolating this function, we might expect to detect DLAs reaching log N(H I) = 23 at the completion of BOSS.

Following N09, we measure the total amount of neutral gas in DLAs at ⟨z⟩ = 2.5 to be Ω_g^DLA ≈ 10^-3. Fig. 2c represents the contribution to the total amount of neutral gas as a function of N(H I). We confirm N09's result that the largest contribution comes from systems with N(H I) ~ 10^21 cm^-2. However, it is interesting that the systems with N(H I) in excess of 5 x 10^21 cm^-2 contribute a non-negligible fraction of Ω_g^DLA (~10%), although they are rarely represented in most surveys.
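These numbers can be checked approximately from the tabulated distribution using the standard expression Ω_g^DLA = (H_0 μ m_H)/(c ρ_crit) ∫ N f(N_HI, X) dN. The sketch below uses the systematics-corrected values of Table 1 for log N(H I) ≥ 20.3, treats f as constant within each bin, and assumes μ = 1.3 for the helium correction; these choices, the constants and the names are assumptions of this example rather than the exact procedure of N09.

```python
import math

# Physical constants (cgs) and assumed parameters
H0 = 70.0 * 1.0e5 / 3.0857e24    # 70 km/s/Mpc expressed in s^-1
C = 2.99792458e10                # speed of light, cm/s
G = 6.674e-8                     # gravitational constant, cm^3 g^-1 s^-2
M_H = 1.6735e-24                 # mass of a hydrogen atom, g
MU = 1.3                         # assumed mean molecular mass (helium correction)
RHO_CRIT = 3.0 * H0 ** 2 / (8.0 * math.pi * G)

# (log N lower edge, log N upper edge, corrected log f) copied from Table 1
TABLE1 = [
    (20.3, 20.4, -21.68), (20.4, 20.5, -21.82), (20.5, 20.6, -21.98),
    (20.6, 20.7, -22.14), (20.7, 20.8, -22.32), (20.8, 20.9, -22.51),
    (20.9, 21.0, -22.67), (21.0, 21.1, -22.91), (21.1, 21.2, -23.11),
    (21.2, 21.3, -23.28), (21.3, 21.4, -23.58), (21.4, 21.5, -23.81),
    (21.5, 21.6, -24.01), (21.6, 21.7, -24.20), (21.7, 21.8, -24.62),
    (21.8, 21.9, -24.85), (21.9, 22.0, -25.60), (22.0, 22.2, -26.05),
    (22.2, 22.4, -26.25),
]


def omega_dla(bins):
    """Sum f * integral(N dN) over the bins and convert to a density parameter."""
    integral = 0.0
    for lo, hi, logf in bins:
        n_lo, n_hi = 10.0 ** lo, 10.0 ** hi
        # f is taken as constant inside the bin, so integral(N dN) = (n_hi^2 - n_lo^2)/2
        integral += 10.0 ** logf * 0.5 * (n_hi ** 2 - n_lo ** 2)
    return H0 * MU * M_H / (C * RHO_CRIT) * integral


omega_total = omega_dla(TABLE1)
# Bins at log N >= 21.7, i.e. N(H I) in excess of about 5 x 10^21 cm^-2
omega_high = omega_dla([b for b in TABLE1 if b[0] >= 21.7])
high_fraction = omega_high / omega_total
```

Under these assumptions the total should come out close to 10^-3 and the high-column-density share at roughly a tenth, in line with the values quoted in the text; the binning approximation means the match is only approximate.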

Table 1. N(H I) distribution function at ⟨z⟩ = 2.5

log N(H I)        log f(N_HI, X)    log f(N_HI, X) corr. (a)    σ(log f(N_HI, X)) (b)
[20.00, 20.10[    -21.20            -21.44                      0.02
[20.10, 20.20[    -21.37            -21.47                      0.02
[20.20, 20.30[    -21.55            -21.59                      0.02
[20.30, 20.40[    -21.66            -21.68                      0.02
[20.40, 20.50[    -21.81            -21.82                      0.02
[20.50, 20.60[    -21.97            -21.98                      0.02
[20.60, 20.70[    -22.13            -22.14                      0.03
[20.70, 20.80[    -22.30            -22.32                      0.03
[20.80, 20.90[    -22.49            -22.51                      0.03
[20.90, 21.00[    -22.63            -22.67                      0.03
[21.00, 21.10[    -22.85            -22.91                      0.04
[21.10, 21.20[    -23.04            -23.11                      0.04
[21.20, 21.30[    -23.19            -23.28                      0.05
[21.30, 21.40[    -23.46            -23.58                      0.06
[21.40, 21.50[    -23.66            -23.81                      0.07
[21.50, 21.60[    -23.83            -24.01                      0.08
[21.60, 21.70[    -24.20            -24.20                      0.08
[21.70, 21.80[    -24.62            -24.62                      0.12
[21.80, 21.90[    -24.85            -24.85                      0.18
[21.90, 22.00[    -25.60            -25.60                      0.53
[22.00, 22.20[    -26.05            -26.05                      0.53
[22.20, 22.40[    -26.25            -26.25                      0.53

Notes. (a) Corrected for systematics. (b) Poissonian errors.

4. Cosmological mass density of neutral gas

Fig. 3 (see also Table 2) shows the evolution of the cosmological mass density in DLAs as a function of redshift. Using mock spectra, we estimate a correction for systematics (over/under-estimate of N(H I), incompleteness and contribution of false positives) as a function of redshift. At high redshift, the correction is mostly due to N(H I) overestimation due to the denser Ly-α forest together with increasing false positive identifications. At z < 2.3, the correction is upwards due to higher incompleteness and slight underestimation of N(H I). Note that the zero-point photometric calibration can be in error by about 5% below 4,000 Å (Pâris et al. 2012), which could differently affect the detection of DLAs and N(H I) measurements in mocks and real data. This problem will be addressed by forthcoming versions of the pipeline (Bolton et al. 2012).

We observe a decrease of Ω_g^DLA from z = 3.5 to z = 2.3 as in N09 and Prochaska & Wolfe (2009), although with higher values at z < 3.2. This can be explained by the ~10% contribution of very large column density systems in DR9 and better knowledge of systematics. It is unclear which value of Ω_g^DLA should be used at z = 0. The measurement by Braun (2012) is based on only three galaxies and although the high-column-density end of f(N_HI, X) seems to be well constrained, this may not be true at low N(H I) (Zwaan et al. 2005). Measurements at z ~ 1 (Rao et al. 2006) are still indirect and, while direct searches for DLAs at low redshift are possible (Meiring et al. 2011), they are still quite limited in terms of sample size. It appears that systematics dominate over statistical uncertainties across most of the redshift range. Keeping this in mind, we can still conclude that Ω_g^DLA evolves only mildly over the past 12 Gyr.

Fig. 2. Column density distribution functions from synthetic (left) and real data (centre) at ⟨z⟩ = 2.5. Horizontal bars represent the bin over which f(N_HI, X) is calculated and vertical error bars represent Poissonian uncertainty. The difference between output and input mock distributions is shown at the bottom of panel a. The double power-law and Γ-function fits to the DR7 distribution (N09, ⟨z⟩ = 2.9) are shown as red dashed lines. f(N_HI, X)(z = 0) are taken from Braun (2012, purple) and Zwaan et al. (2005, green). Right: The contribution of DLAs in a given N(H I) range to the total mass census of neutral gas. DR9 values are corrected for systematics.

Fig. 3. Cosmological mass density of neutral gas in DLAs as a function of redshift (Z05: Zwaan et al. (2005), B12: Braun (2012), R06: Rao et al. (2006), PW09: Prochaska & Wolfe (2009), DR9: this work).

Table 2. Ω_g^DLA and DLA incidence (dN/dz) in different redshift bins.

z                      2.0-2.3      2.3-2.6      2.6-2.9      2.9-3.2      3.2-3.5
Δz                     3690         4509         2867         1620         769
ΔX (a)                 11625        14841        9900         5834         2883
10^3 Ω_g^DLA (b)       ...          ...          1.19/1.04    1.44/1.10    1.87/1.27
10^3 σ(Ω_g^DLA) (c)    ...          0.04         0.05         0.08         0.13
dN/dz (b)              0.19/0.20    0.21/0.20    0.29/0.25    0.36/0.29    0.48/0.36

Notes. (a) Total absorption pathlength (see Lanzetta et al. 1991). (b) Direct values/corrected for systematics. (c) Statistical uncertainty.

5. Conclusion

We have presented the first results of our ongoing survey for DLAs in the SDSS-III Baryon Oscillation Spectroscopic Survey, Data Release 9. This represents by far the largest sample of DLAs to date (with ~12,000 systems with log N(H I) > 20) and should allow numerous follow-up studies. We expect the sample to be increased by a factor larger than two at the completion of BOSS. Using a well-defined sub-sample, and controlling systematics (which dominate over statistical errors) with synthetic spectra, we derive the H I column density distribution at ⟨z⟩ = 2.5 in the range 10^20 - 2 x 10^22 cm^-2 and characterise the evolution of the cosmological mass density of neutral gas in DLAs at 2 < z < 3.5. This study should help to constrain models of galaxy formation and evolution by measuring the amount of neutral gas immediately available to fuel star formation through cosmic history.